241 research outputs found

On Incomplete XML Documents with Integrity Constraints

Author: Barceló Pablo
Libkin Leonid
Reutter Juan L.
Publication date: 01/01/2010

Abstract. We consider incomplete specifications of XML documents in the presence of schema information and integrity constraints. We show that integrity constraints such as keys and foreign keys affect consistency of such specifications. We prove that the consistency problem for incomplete specifications with keys and foreign keys can always be solved in NP. We then show a dichotomy result, classifying the complexity of the problem as NP-complete or PTIME, depending on the precise set of features used in incomplete descriptions.

CiteSeerX

Edinburgh Research Explorer

Bisimulations on data graphs

Author: Abriola Sergio Alejandro
Barceló Pablo
Figueira Diego
Figueira Santiago
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2018

Bisimulation provides structural conditions to characterize indistinguishability from an external observer between nodes on labeled graphs. It is a fundamental notion used in many areas, such as verification, graph-structured databases, and constraint satisfaction. However, several current applications use graphs where nodes also contain data (the so-called “data graphs”), and where observers can test for equality or inequality of data values (e.g., asking the attribute ‘name’ of a node to be different from that of all its neighbors). The present work constitutes a first investigation of “data aware” bisimulations on data graphs. We study the problem of computing such bisimulations, based on the observational indistinguishability for XPath (a language that extends modal logics like PDL with tests for data equality) with and without transitive closure operators. We show that in general the problem is PSPACE-complete, but identify several restrictions that yield better complexity bounds (coNP, PTIME) by controlling suitable parameters of the problem, namely the amount of non-locality allowed, and the class of models considered (graphs, DAGs, trees). In particular, this analysis yields a hierarchy of tractable fragments.

Affiliation: Abriola, Sergio Alejandro. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación En Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación En Ciencias de la Computación; Argentina
Affiliation: Barceló, Pablo. Universidad de Chile; Chile
Affiliation: Figueira, Diego. Centre National de la Recherche Scientifique; France
Affiliation: Figueira, Santiago. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación En Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación En Ciencias de la Computación; Argentin

CONICET Digital

Repositorio Académico de la Universidad de Chile

Context-Free Path Querying with Structural Representation of Result

Author: Afroozeh Ali
Barceló Pablo
Hofman Piotr
Johnstone Adrian
Tomita M.
Publication date: 18/01/2017

Graph data models and graph databases are very popular in various areas such as bioinformatics, the semantic web, and social networks. One specific problem in the area is path querying with constraints formulated in terms of formal grammars. The query in this approach is written as a grammar, and path querying is graph parsing with respect to the given grammar. There are several solutions to it, but how to provide a structural representation of the query result that is practical for answer processing and debugging is still an open problem. In this paper we propose a graph parsing technique which allows one to build such a representation with respect to a given grammar in polynomial time and space for an arbitrary context-free grammar and graph. The proposed algorithm is based on the generalized LL parsing algorithm, which reduces time complexity in some cases, whereas previous solutions are based mostly on the CYK or Earley algorithms.

Comment: Evaluation extende

arXiv.org e-Print Archive

Crossref

Guarded Ontology-Mediated Queries Distributing Over Components

Author: Barceló Pablo
Berger Gerald
Pieris Andreas
Publication date: 01/01/2017

Edinburgh Research Explorer

Repositorio Académico de la Universidad de Chile

Separating Automatic Relations

Author: Barceló Pablo
Figueira Diego
Morvan Rémi
Publication date: 02/08/2023

We study the separability problem for automatic relations (i.e., relations on finite words definable by synchronous automata) in terms of recognizable relations (i.e., finite unions of products of regular languages). This problem takes as input two automatic relations $R$ and $R'$, and asks if there exists a recognizable relation $S$ that contains $R$ and does not intersect $R'$. We show this problem to be undecidable when the number of products allowed in the recognizable relation is fixed. In particular, checking if there exists a recognizable relation $S$ with at most $k$ products of regular languages that separates $R$ from $R'$ is undecidable, for each fixed $k \geq 2$. Our proofs reveal tight connections, of independent interest, between the separability problem and the finite coloring problem for automatic graphs, where colors are regular languages.

Comment: Long version of a paper accepted at MFCS 202

arXiv.org e-Print Archive

First-Order Rewritability of Frontier-Guarded Ontology-Mediated Queries

Author: Barceló Pablo
Berger Gerald
Lutz Carsten
Pieris Andreas
Publication date: 01/01/2018

Crossref

Edinburgh Research Explorer

Model Interpretability through the Lens of Computational Complexity

Author: Barceló Pablo
Monet Mikaël
Pérez Jorge
Subercaseaux Bernardo
Publication date: 12/11/2020

In spite of several claims stating that some models are more interpretable than others (e.g., "linear models are more interpretable than deep neural networks"), we still lack a principled notion of interpretability to formally compare among different classes of models. We make a step towards such a notion by studying whether folklore interpretability claims have a correlate in terms of computational complexity theory. We focus on local post-hoc explainability queries that, intuitively, attempt to answer why individual inputs are classified in a certain way by a given model. In a nutshell, we say that a class $\mathcal{C}_1$ of models is more interpretable than another class $\mathcal{C}_2$, if the computational complexity of answering post-hoc queries for models in $\mathcal{C}_2$ is higher than for those in $\mathcal{C}_1$. We prove that this notion provides a good theoretical counterpart to current beliefs on the interpretability of models; in particular, we show that under our definition and assuming standard complexity-theoretical assumptions (such as P $\neq$ NP), both linear and tree-based models are strictly more interpretable than neural networks. Our complexity analysis, however, does not provide a clear-cut difference between linear and tree-based models, as we obtain different results depending on the particular post-hoc explanations considered. Finally, by applying a finer complexity analysis based on parameterized complexity, we are able to prove a theoretical result suggesting that shallow neural networks are more interpretable than deeper ones.

Comment: 36 pages, including 9 pages of main text. This is the arXiv version of the NeurIPS'2020 paper. Except for minor differences that could be introduced by the publisher, the only difference should be the addition of the appendix, which contains all the proofs that do not appear in the main tex

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Hal-Diderot

No Agreement Without Loss: Learning and Social Choice in Peer Review

Author: Barceló Pablo
Duarte Mauricio
Rojas Cristóbal
Steifer Tomasz
Publication date: 03/11/2022

In peer review systems, reviewers are often asked to evaluate various features of submissions, such as technical quality or novelty. A score is given to each of the predefined features, and based on these the reviewer has to provide an overall quantitative recommendation. However, reviewers differ in how much they value different features. It may be assumed that each reviewer has her own mapping from a set of criteria scores (score vectors) to a recommendation, and that different reviewers have different mappings in mind. Recently, Noothigattu, Shah and Procaccia introduced a novel framework for obtaining an aggregated mapping by means of Empirical Risk Minimization based on $L(p,q)$ loss functions, and studied its axiomatic properties in the sense of social choice theory. We provide a body of new results about this framework. On the one hand, we study a trade-off between strategy-proofness and the ability of the method to properly capture agreements of the majority of reviewers. On the other hand, we show that dropping a certain unrealistic assumption invalidates the previously reported results. Moreover, in the general case, strategy-proofness fails dramatically in the sense that a reviewer is able to make significant changes to the solution in her favor by arbitrarily small changes to her true beliefs. In particular, no approximate version of strategy-proofness is possible in this general setting since the method is not even continuous w.r.t. the data. Finally, we propose a modified aggregation algorithm which is continuous and show that it has good axiomatic properties.

Comment: preprint submitted to a conferenc

arXiv.org e-Print Archive

Semantic Optimization of Conjunctive Queries

Author: Barceló Pablo
Figueira Diego
Gottlob Georg
Pieris Andreas
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2020

This work deals with the problem of semantic optimization of the central class of conjunctive queries (CQs). Since CQ evaluation is NP-complete, a long line of research has focussed on identifying fragments of CQs that can be efficiently evaluated. One of the most general restrictions corresponds to generalized hypertreewidth bounded by a fixed constant k ≥ 1; the associated fragment is denoted GHW_k. A CQ is semantically in GHW_k if it is equivalent to a CQ in GHW_k. The problem of checking whether a CQ is semantically in GHW_k has been studied in the constraint-free case, and it has been shown to be NP-complete. However, in case the database is subject to constraints such as tuple-generating dependencies (TGDs) that can express, e.g., inclusion dependencies, or equality-generating dependencies (EGDs) that capture, e.g., key dependencies, a CQ may turn out to be semantically in GHW_k under the constraints, while not being semantically in GHW_k without the constraints. This opens avenues to new query optimization techniques. In this article, we initiate and develop the theory of semantic optimization of CQs under constraints. More precisely, we study the following natural problem: Given a CQ and a set of constraints, is the query semantically in GHW_k, for a fixed k ≥ 1, under the constraints, or, in other words, is the query equivalent to one that belongs to GHW_k over all those databases that satisfy the constraints? We show that, contrary to what one might expect, decidability of CQ containment is a necessary but not a sufficient condition for the decidability of the problem in question. In particular, we show that checking whether a CQ is semantically in GHW_1 is undecidable in the presence of full TGDs (i.e., Datalog rules) or EGDs.
In view of the above negative results, we focus on the main classes of TGDs for which CQ containment is decidable and that do not capture the class of full TGDs, i.e., guarded, non-recursive, and sticky sets of TGDs, and show that the problem in question is decidable, while its complexity coincides with the complexity of CQ containment. We also consider key dependencies over unary and binary relations, and we show that the problem in question is decidable in elementary time. Furthermore, we investigate whether being semantically in GHW_k alleviates the cost of query evaluation. Finally, in case a CQ is not semantically in GHW_k, we discuss how it can be approximated via a CQ that falls in GHW_k in an optimal way. Such approximations might help find “quick” answers to the input query when exact evaluation is intractable.

Edinburgh Research Explorer

Oxford University Research Archive